Ngan Le

DRIVESPATIAL: A Benchmark for Spatiotemporal Intelligence in VLMs for Autonomous Driving

May 22, 2026
Via arXiv

CodeGraphVLP: Code-as-Planner Meets Semantic-Graph State for Non-Markovian Vision-Language-Action Models

Apr 24, 2026
Via arXiv

SemLT3D: Semantic-Guided Expert Distillation for Camera-only Long-Tailed 3D Object Detection

Apr 20, 2026
Via arXiv

GazeQwen: Lightweight Gaze-Conditioned LLM Modulation for Streaming Video Understanding

Mar 26, 2026
Via arXiv

SIGMA: A Physics-Based Benchmark for Gas Chimney Understanding in Seismic Images

Mar 24, 2026
Via arXiv

DuFal: Dual-Frequency-Aware Learning for High-Fidelity Extremely Sparse-view CBCT Reconstruction

Jan 21, 2026
Via arXiv

Clutter-Resistant Vision-Language-Action Models through Object-Centric and Geometry Grounding

Dec 27, 2025
Via arXiv

Rethinking Progression of Memory State in Robotic Manipulation: An Object-Centric Perspective

Nov 18, 2025
Via arXiv

SlotVLA: Towards Modeling of Object-Relation Representations in Robotic Manipulation

Nov 10, 2025
Via arXiv

Learning Human Motion with Temporally Conditional Mamba

Oct 14, 2025
Via arXiv